ABSTRACT

A three-dimensional, time-dependent Navier-Stokes code using MacCormack's
explicit scheme has been vectorized for the CRAY-1 computer. Computations
were performed for a turbulent, transonic, normal shock wave boundary layer
interaction in a wind tunnel diffuser. The vectorized three-dimensional
Navier-Stokes code on the CRAY-1 computer achieved a speed of 128 times
that of the original scalar code processed by a CYBER 74 computer. The
vectorized version of the code outperforms the scalar code on the CRAY-1
computer by a factor of 8.13. A comparison between the experimental data
and the numerical simulation is also made.

Speed of Sound
Deformation Tensor
Specific Internal Energy, c_v T + (u^2 + v^2 + w^2)/2
Vector Fluxes (F, G, H), Equation (15)
Differencing Operator
Mach Number
Static Pressure
Rate of Heat Transfer
Reynolds Number Based on Running Length, ρ∞ u∞ x/μ∞
Static Temperature
Time
Dependent Variables in Vector Form (ρ, ρu, ρv, ρw, ρe)
Velocity Vector
Velocity Components in Cartesian Frame (u, v, w)
Coordinates in Cartesian Frame
Transformed Coordinate System (ξ, η, ζ), Equation (14)
Density
Stress Tensor
INTRODUCTION 


In the past decade, computational fluid dynamics has become firmly
established as a credible tool for aerodynamics research. Aided by some
rather crude and heuristic turbulence models, success has been achieved
even for complex turbulent flows. In spite of all these convincing
demonstrations, the objective of a wide application of computational fluid
dynamics in engineering design has yet to be achieved. The basic
limitation is in cost effectiveness. A lower-cost and systematic
methodology needs to be developed.

The present analysis addresses one of the key objectives in obtaining
efficient numerical processing. To achieve this objective, two approaches
seem obvious: either develop special algorithms designed for a particular
category of problems according to the laws of physics, or utilize an
improved computer. In the case of special algorithms, a better
understanding of the generic structure of the flow field is required. In
general, these attempts have been successful and have achieved an order of
magnitude improvement in computing speed. On the other hand, a class of
computers designed for scientific computations, the CRAY-1, STAR 100, and
ILLIAC IV among others, has become available. The most significant advance
in computer hardware related to computational fluid dynamics is the vector
processor, which permits a vector to be processed at an exceptional speed.
This option gives a new perspective, i.e., a drastic reduction in
computing time.

A three-dimensional, time-dependent Navier-Stokes code using MacCormack's
explicit scheme [13] has been vectorized for the CRAY-1 computer. The
selection of this particular finite differencing scheme is based on its
past ability to perform a large number of successful benchmark runs, its
proven shock-capturing capability, and the inherent simplicity of the
basic algorithms. The CRAY-1 computer was chosen because at the present
time, among all the available general purpose scientific processors, it
provides the highest potential floating point computation rate in both the
scalar and the vector mode. The combination of the selected algorithm and
the CRAY-1 computer provides a benchmark for future development and a tool
for current engineering evaluation.

The problem selected for evaluating the CRAY-1 performance was the
experimental investigation of Abbiss [15, 16] of a three-dimensional
interaction of a normal shock with a turbulent boundary layer in a square
wind tunnel diffuser at a Reynolds number of thirty million and Mach
number of 1.51. The primary purpose of the paper is to determine the
computational speed of the code, although a comparison with experimental
data is presented to demonstrate the validity of the solution.

GOVERNING  EQUATIONS 

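The time-dependent, three-dimensional compressible Navier-Stokes
equations in mass-averaged variables can be given as

\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{u}) = 0    (1)

\frac{\partial (\rho \mathbf{u})}{\partial t} + \nabla \cdot (\rho \mathbf{u}\mathbf{u} - \boldsymbol{\tau}) = 0    (2)

\frac{\partial (\rho e)}{\partial t} + \nabla \cdot (\rho e \mathbf{u} - \mathbf{u} \cdot \boldsymbol{\tau} + \mathbf{q}) = 0    (3)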

The turbulent closure of the present analysis is accomplished through an
eddy viscosity model. The effective thermal conductivity is also defined
by the turbulent Prandtl number (Pr_t = 0.9). The equation of state,
Sutherland's viscosity law, and the assigned molecular Prandtl number
(0.73) formally close the system of governing equations.


Figure  1.  Flow  Field  Schematic 


Since the wind tunnel flow field consisted of four symmetrical quadrants,
only a single quadrant was computed. The boundaries of the computational
domain contain two intersecting wind tunnel walls and two planes of
symmetry, for which the associated boundary conditions are
straightforward (Figure 1). In order to develop upstream conditions
equivalent to the experiment, a separate computation is initiated with a
free stream condition and permitted to develop a three-dimensional
boundary layer along the corner region until the boundary layer
duplicates the experimental observation (δ = 4.0 cm, x = 316 cm). Then,
the computed flow field at this streamwise location is imposed as the
upstream condition for the interaction computation. On the wind tunnel
walls, the boundary conditions are no-slip for the velocity components
and a constant surface temperature. The wind tunnel wall pressure is
obtained by satisfying the momentum equation at the solid surface. On the
planes of symmetry, the symmetrical boundary conditions are given for all
dependent variables. The normal shock wave across the wind tunnel is then
specified according to the Rankine-Hugoniot conditions. The far
downstream boundary condition is the well known no-change condition. In
summary:

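INITIAL CONDITION:

U(0, \xi, \eta, \zeta) = U_\infty    (9)

UPSTREAM CONDITION:

U(t, 0, \eta, \zeta) = U_\infty    (10)

DOWNSTREAM:

\partial U / \partial \xi = 0

ON PLANES OF SYMMETRY:

\partial U / \partial \eta = 0 and \partial U / \partial \zeta = 0

ON WIND TUNNEL WALL:

u = v = w = 0,  T_w = 313.79 K  at y, z = 0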


A coordinate system transformation is introduced to improve the numerical
resolution in the viscous-dominated region.

\xi = x / x_L    (14a)

\eta = (1/k) \ln[1 + (e^k - 1) y / y_L]    (14b)

\zeta = (1/k) \ln[1 + (e^k - 1) z / z_L]    (14c)

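The governing equations in the transformed space are of the following
form:

\frac{\partial U}{\partial t} + \xi_x \frac{\partial F}{\partial \xi} + \eta_y \frac{\partial G}{\partial \eta} + \zeta_z \frac{\partial H}{\partial \zeta} = 0    (15)

where \xi_x, \eta_y, and \zeta_z are the metrics of the coordinate
transformation. The definition of the conventional flux vectors F, G, and
H can be found in Ref. 7.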
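
The logarithmic stretching of Equation (14b), \eta = (1/k) ln[1 + (e^k - 1) y/y_L], can be sketched in a few lines of Python. This is only an illustration: the function names, the stretching parameter k = 3.0, and the unit reference length y_L = 1.0 are assumed values, not those of the paper. The sketch shows that a uniform grid in \eta clusters the physical grid points near the wall (y = 0), and that the metric \eta_y follows directly by differentiating (14b).

```python
import math

def eta_from_y(y, y_L, k):
    """Stretched coordinate: eta = (1/k) ln[1 + (e^k - 1) y / y_L]."""
    return math.log(1.0 + (math.exp(k) - 1.0) * y / y_L) / k

def y_from_eta(eta, y_L, k):
    """Inverse mapping: physical distance y for a given eta in [0, 1]."""
    return y_L * (math.exp(k * eta) - 1.0) / (math.exp(k) - 1.0)

def eta_y(y, y_L, k):
    """Metric d(eta)/dy obtained by differentiating the mapping above."""
    a = math.exp(k) - 1.0
    return a / (k * (y_L + a * y))

# Assumed illustration values: 33 uniformly spaced eta points.
n, y_L, k = 33, 1.0, 3.0
y_grid = [y_from_eta(j / (n - 1), y_L, k) for j in range(n)]
first_gap = y_grid[1] - y_grid[0]    # spacing next to the wall
last_gap = y_grid[-1] - y_grid[-2]   # spacing at the outer edge
```

With these assumed values the spacing next to the wall is much smaller than the spacing at the outer edge, which is the purpose of the transformation.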


NUMERICAL PROCEDURE
AND DATA STRUCTURE


The basic numerical method is the time-split or factorized scheme
originated by MacCormack. The finite difference formulation is expressed
in terms of the difference operator as a symmetric sequence of
one-dimensional operators in the \xi, \eta, and \zeta directions
(Equation 16).


Each difference operator contains a predictor and corrector. During a
specific numerical sweep, the flux vectors are approximated by a central,
forward, and backward differencing scheme in such a fashion that after a
complete cycle of the predictor and corrector operations all the
derivatives are effectively approximated by a central differencing
scheme. A graphic representation of these operations is given by
Figure 2.
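
As a rough illustration of how a forward-differenced predictor and a backward-differenced corrector combine, the sketch below applies the pairing to the one-dimensional linear advection equation u_t + a u_x = 0 on a periodic grid. The model problem, the function name `maccormack_step`, and the Courant number `c = a*dt/dx` are assumptions chosen for brevity; this is not the paper's three-dimensional operator.

```python
def maccormack_step(u, c):
    """One predictor-corrector step for u_t + a u_x = 0, with c = a*dt/dx.

    The grid is treated as periodic (an assumption of this sketch).
    """
    n = len(u)
    # Predictor: forward difference on the current values.
    up = [u[i] - c * (u[(i + 1) % n] - u[i]) for i in range(n)]
    # Corrector: backward difference on the predicted values, then
    # average with the old level. up[i - 1] wraps around for i = 0.
    return [0.5 * (u[i] + up[i] - c * (up[i] - up[i - 1])) for i in range(n)]

# A unit pulse advected one cell per step when c = 1.
pulse = [0.0, 1.0, 0.0, 0.0]
shifted = maccormack_step(pulse, 1.0)
```

For c = 1 the pulse moves exactly one cell, and for any c the sum of the values is preserved on the periodic grid, since both one-sided differences sum to zero over the whole grid.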


Figure 2. Grid Points Involved in the Time Step Sweep


When investigating flows with strong shock waves, it is necessary to
employ numerical damping in a shock-capturing scheme. Fourth-order
pressure damping was utilized, which generates an artificial
viscosity-like term [17]. The approximation of second-order central
differencing for the corrector step required additional grid point
information beyond the immediately adjacent planes. The damping terms,
however, are effective only in the presence of shock waves, where the
numerical resolution is degraded.
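
The text does not give the damping formula. One form commonly used with this type of scheme scales the damping by a normalized second difference of pressure, and the sketch below assumes that form purely for illustration. The function name `pressure_switch` and the sample pressure data are hypothetical. It shows why such a term is active only near shocks: the switch is essentially zero where pressure varies smoothly and of order one-tenth or more across a jump.

```python
def pressure_switch(p):
    """Normalized second difference of pressure at interior points.

    Assumed form: |p[i+1] - 2 p[i] + p[i-1]| / (p[i+1] + 2 p[i] + p[i-1]).
    """
    return [
        abs(p[i + 1] - 2.0 * p[i] + p[i - 1]) / (p[i + 1] + 2.0 * p[i] + p[i - 1])
        for i in range(1, len(p) - 1)
    ]

# Hypothetical data: a gentle linear pressure rise and a sharp jump.
smooth = [1.0 + 0.01 * i for i in range(8)]
shock = [1.0] * 4 + [2.5] * 4

smooth_max = max(pressure_switch(smooth))
shock_max = max(pressure_switch(shock))
```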

From the symmetric differencing operator sequence of predictor and
corrector steps, one detects that storage of the dependent variables at
the predictor level can be completely eliminated by retaining only the
three cyclic pages currently in use (Figure 3). For a flow field
requiring a large amount of data storage, this reduction in memory
requirement is substantial. Meanwhile, the paging process is reduced from
two sweeps to one. The predictor and corrector sequence is performed
within one sweep by overlapping the corrector operation during one
fractional time step.


Figure  3.  Data  Storage  and  Data 
Flow  Diagram 
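
The idea of holding only three pages in memory during a streamwise sweep can be sketched with a rotating buffer. This is a simplified illustration, not the paper's data manager: the function name `sweep_with_three_pages`, the `stencil` callback, and the one-value pages are assumptions, and the overlap of predictor and corrector within the sweep is not modeled.

```python
from collections import deque

def sweep_with_three_pages(pages, stencil):
    """Stream pages through a three-page window and apply a stencil.

    `pages` may be any iterable (for example a generator reading from
    disk), so at most three pages are resident at a time.
    """
    window = deque(maxlen=3)   # the oldest page is dropped automatically
    results = []
    for page in pages:
        window.append(page)
        if len(window) == 3:
            results.append(stencil(window[0], window[1], window[2]))
    return results

def central_difference(prev_page, this_page, next_page):
    """Streamwise central difference for unit page spacing."""
    return [(b - a) / 2.0 for a, b in zip(prev_page, next_page)]

# Hypothetical one-point pages holding x**2 at x = 0, 1, 2, 3.
demo = sweep_with_three_pages([[0.0], [1.0], [4.0], [9.0]], central_difference)
```

Here `demo` holds the central differences at the two interior pages; the boundary pages would need separate treatment.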

Once the planar or page storage is adopted, the vector length can be
determined. Separate vectors are constructed for the η and ζ directions,
yielding vector lengths approximately equal to the number of grid points
in each direction. In order to keep all solutions in the same page
(η-ζ plane), the streamwise sweep (ξ sweep) is also vectorized along a
direction within the page.

For the present problem, the computational domain with the dimensions of
356.3 cm x 45.3 cm x 45.5 cm is partitioned into two streamwise sections
of 64 pages each. Every page contains 33 x 33 grid points in the η and ζ
coordinates, respectively. The problem is solved in two steps. The first
computational section generates a three-dimensional boundary layer over a
corner, which becomes the inflow boundary condition for the following
shock-boundary layer interaction domain. Both contain 64 x 33 x 33 grid
points, but a finer streamwise mesh spacing Δx = 1.27 cm was used for the
interaction zone to gain a finer numerical resolution of the
shock-boundary layer interaction. The ratio between the fine and coarse
streamwise grid spacing is 0.3063 of the local boundary-layer thickness
(4.0 cm). The cross flow plane grid-point distribution, however, remains
identical between the two overlapping segments. The memory requirement
for each is about 0.545 million words.
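
The grid and storage figures quoted above can be checked with a few lines of arithmetic. The variable names are illustrative; the inputs are the values stated in the text.

```python
pages, n_eta, n_zeta = 64, 33, 33

# Grid points in one computational section and in both sections together.
points_per_section = pages * n_eta * n_zeta   # 69,696
total_points = 2 * points_per_section         # 139,392

# Storage quoted for one section, expressed per grid point.
words_per_section = 0.545e6
words_per_point = words_per_section / points_per_section   # about 7.8
```

The total of 139,392 points agrees with the rounded figure of 139,400 grid points quoted in the conclusions, and the quoted storage corresponds to roughly eight words per grid point.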

The numerical solution is considered at its steady state asymptote when
the maximum difference between two consecutive time levels of the static
pressure in the strong interacting zone is less than 0.2 percent. In the
leading computational domain the convergence criterion is established
similarly but is based on the velocity profiles instead of pressure.
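
A minimal sketch of such a convergence test is given below. It assumes that the 0.2 percent refers to the change relative to the local pressure at the earlier time level, which the text does not state explicitly; the function names and sample values are hypothetical.

```python
def max_relative_change(p_old, p_new):
    """Largest pointwise change between two time levels, relative to p_old."""
    return max(abs(new - old) / abs(old) for old, new in zip(p_old, p_new))

def is_converged(p_old, p_new, tol=0.002):
    """True when the maximum relative change is below tol (0.2 percent)."""
    return max_relative_change(p_old, p_new) < tol

# Hypothetical pressure samples at two consecutive time levels.
settled = is_converged([1.0, 2.0], [1.001, 2.002])     # 0.1 percent change
still_moving = is_converged([1.0, 2.0], [1.0, 2.01])   # 0.5 percent change
```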

TIMING RESULTS

A portion of the present effort is aimed at making internal comparisons
of the relative times for various types of functional unit processing and
memory loading (I/O) for the vectorized code. A knowledge of the relative
time expenditure is important to provide some insight into the program
execution rate. Although this type of data is code dependent, the present
example is deemed typical of a large class of Navier-Stokes solvers. The
timing information is measured by vector operation counts and shown in
Figure 4. It is obvious that the relative usage of the memory path and
functional units is dominated by memory loadings (34.6%) and floating
point multiplication (33.3%). Within the functional units, the relative
usage of the floating point addition and multiplication has the ratio of
two to three. The relative usage of the reciprocal approximation is
extremely rare, i.e., less than 2%. In spite of the high percentage of
memory loading, a portion of the vectorized Fortran code has achieved an
execution rate of 42.9 MFLOPS [11].

Further improvements still can be made either in Fortran or assembly
language versions of the present code. However, we feel an overall
execution rate greater than 60 MFLOPS on this size problem is unlikely.

A basic dilemma exists for the comparative investigation; namely, in the
process of vectorization significant changes were made either in the
amount of computation performed or in the number of subroutine calls
made. The final vectorized program usually bears little resemblance to
the original scalar code [7, 11]. Substantial improvement in performance
of the vectorized code on a scalar machine has also been reported.
However, this improvement in performance can be considered as a
contribution due to the vectorization process.


Figure 4. Vector Operation Counts in Percentage

In order to perform the comparative study, a criterion must be
established. The ultimate evaluation of data processing rate is the
computing time. Completely duplicated computations for an identical fluid
mechanics problem are usually prohibited by the in-core memory and the
indexing limitations of various processors. Therefore, one has to accept
the rate of data processing as the criterion. The rate of data processing
is commonly defined as

RDP = CPU Time/(Total Number of Grid Points x Total Number of Iterations)

This particular rate of data processing is most suitable for numerical
programs with similar algorithms and convergence rates. If the ratio
between field grid points and boundary points can be maintained between
two programs, then the comparison is particularly meaningful.
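
The definition translates directly into code. In the sketch below the function names are illustrative, and the CPU time and iteration count are made-up values chosen only to show the units (seconds per grid point per iteration); the grid size is the 17 x 33 x 33 system mentioned later for the early runs.

```python
def rdp(cpu_seconds, grid_points, iterations):
    """Rate of data processing: CPU seconds per grid point per iteration."""
    return cpu_seconds / (grid_points * iterations)

def relative_speed(rdp_reference, rdp_candidate):
    """How many times faster the candidate is than the reference."""
    return rdp_reference / rdp_candidate

# Hypothetical run: 100 CPU seconds for 10 iterations on a 17 x 33 x 33 grid.
example_points = 17 * 33 * 33                  # 18,513 grid points
example_rdp = rdp(100.0, example_points, 10)   # about 5.4e-4 s/point/iteration
```

Because RDP is normalized by both grid size and iteration count, a smaller value means a faster machine or code, and ratios of RDP values can be compared across runs of different size.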

In Table 1, the comparison of timing results between the scalar code and
vectorized code on the CRAY-1 is presented.

Table 1

The Comparison of Scalar and
Vector Processing on CRAY-1

VERSION OF CODE    RDP (Sec/Pts, Iterations)
Scalar             4.761 x 10^-4
Vector             5.861 x 10^-5

The vectorized program outperforms the original scalar code by a factor
of 8.13. In Table 2, the timing results of the scalar code and vectorized
code performance for four different computers are given.

Table 2

Comparative Timing Results

COMPUTER    VERSION OF CODE    RDP             RDP(CYBER 74)/RDP
CYBER 74    Scalar             7.48 x 10^-3    1.0
CDC 7600    Scalar             1.45 x 10^-3    5.2
CRAY-1      Scalar             4.76 x 10^-4    15.7
CRAY-1      Vector             5.86 x 10^-5    127.7
CRAY-1      Assembly           5.19 x 10^-5    144.2
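
The relative speeds in Table 2 can be cross-checked against the RDP column. The exponents used below are an assumption: they are not legible in the table itself and are inferred here as the only powers of ten consistent with the printed speed ratios. The dictionary keys are illustrative labels.

```python
# RDP values from Table 2, in seconds per grid point per iteration.
# Exponents are inferred from the printed ratios (an assumption).
rdp_table = {
    "CYBER 74 scalar": 7.48e-3,
    "CDC 7600 scalar": 1.45e-3,
    "CRAY-1 scalar": 4.76e-4,
    "CRAY-1 vector": 5.86e-5,
    "CRAY-1 assembly": 5.19e-5,
}

reference = rdp_table["CYBER 74 scalar"]
ratios = {name: reference / value for name, value in rdp_table.items()}

# Vector versus scalar code on the same machine.
cray_vector_gain = rdp_table["CRAY-1 scalar"] / rdp_table["CRAY-1 vector"]
```

The recomputed ratios agree with the printed column to within the rounding of the three-digit RDP entries, and the vector-to-scalar gain on the CRAY-1 comes out close to the quoted factor of 8.13.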

A brief description of each running condition for which the timing
results were obtained may help with the interpretation of the data. The
computations conducted on the CYBER 74 and CDC 7600 with a grid point
system of (17 x 33 x 33) were performed in the early phase of the present
task. On the CYBER 74 computer the data storage problem was overcome by a
data manager subroutine in conjunction with a random access disk file.
The computation carried out on the CDC 7600 used large core memory for
all the dependent variables. The I/O requirement is substantial,
particularly for the computation performed on the CYBER 74.


FORTRAN  VS.  ASSEMBLY  LANGUAGE 

The multiple functional units and memory hierarchy of the CRAY-1 can be
difficult for the Fortran compiler (CFT) to manage efficiently.
Consequently, CRAY Assembly Language (CAL) versions of a number of
subroutines which account for up to 78% of the computation time were
written with the aid of a simulator [18]. These kernels were also
vectorized in Fortran with the CRAY-1 architecture and compiler features
in mind; however, non-ANSI standard utility functions [19] and unusual
Fortran constructs [20] were not employed. The principal timing results
follow.

1) Among 9 kernels, assembly language speedups ranged from 11% to 29%
with vector lengths of 33 (= a grid dimension).

2) An overall speedup of 14.2% was achieved (Table 2), including the
common 22% Fortran.

3) A detailed simulator-produced evaluation of a subroutine which
accounts for ≈ 20% of the total computation time is given in [21]. The
execution rate of ≈ 50 MFLOPS is 1/3 of the maximum practical rate of
the processor. However, the memory path is busy 70% of the time for the
Fortran code for a vector length of 63, and up to 90% for the CAL code,
indicating the memory bound nature of the algorithm on the CRAY-1.
Indeed, the 90% busy time is viewed as an excellent indicator of the
optimality of the CAL code.

A  more  detailed  comparative  study  of  this 
code  is  given  in  [21]. 
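
The relation between the kernel speedups and the overall speedup can be bounded with a simple Amdahl-type estimate. The sketch assumes that a "speedup of x%" means the Fortran-to-CAL time ratio is 1 + x/100 and that exactly 78% of the Fortran run time lies in the rewritten kernels; both are interpretations of the figures above rather than statements from the text, and the function name is illustrative.

```python
def overall_speedup(fraction, kernel_speedup):
    """Whole-program speedup when `fraction` of the time is sped up."""
    return 1.0 / ((1.0 - fraction) + fraction / kernel_speedup)

# Bounds from the slowest (11%) and fastest (29%) kernel improvements.
low = overall_speedup(0.78, 1.11)    # about 1.08
high = overall_speedup(0.78, 1.29)   # about 1.21
```

Under these assumptions the overall gain must lie between roughly 8% and 21%, so the reported 14.2% falls inside the expected range.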

COMPARISONS  WITH  EXPERIMENTAL  DATA 

In Figure 5, several velocity profiles across the wind tunnel at a
Reynolds number of 3.0 x 10^7 are presented. This location represents the
flow field condition at the end of the leading segment of the
computational domain, which is also the upstream condition for the
following interaction zone. The present results agree reasonably well
with the data of Seddon. The data, however, were collected at a Reynolds
number one decade lower than the present condition and at a slightly
different Mach number (1.47 vs. 1.51). At the range of Reynolds numbers
considered, the Reynolds number dependence should be scaled out by the
boundary layer thickness. An independent boundary-layer calculation using
the exact simulated condition was performed that verified this
contention. It was found that the difference in magnitude of velocity is
a few percent. The present result underpredicts the measured boundary
layer thickness by about eight percent.



Figure  5.  Velocity  Profiles  Along 
the  Tunnel  Wall 

A direct comparison of several velocity distributions between the data of
Abbiss et al. [15] and the present calculation is presented in Figure 6
for the interaction region. The data are displayed for fixed x/δ and y
coordinates away from the corner domain. The coordinate x is taken in the
streamwise direction along the tunnel floor and y normal to the floor.
Excellent agreement between the data and calculation is observed for the
regions either deeply embedded within the boundary layer or completely
contained in the inviscid domain. The maximum discrepancy between data
and calculation is in the lambda wave structure. The maximum disparity
between data and calculations is about 10 percent.


Figure 6. Comparison of the Flow Field Velocity in the Interactive
Region

In Figure 7 the Mach number contour is presented in an attempt to compare
with the flow field structure given by Abbiss et al. [15] in Figure 8.
The bifurcation of the normal shock wave is clearly indicated. The
calculation nearly duplicates all of the primary features of the
experimental observation. However, a difference can be discerned in the
dimension of the embedded supersonic zone between the experimental
observation and calculation. The local supersonic zone emanates from the
expansion due to the total pressure difference between the normal shock
and the lambda shock structure and the rapid change in the displacement
surface. A few percent disparity in predicting the magnitude of velocity
leads to the distinguishable discrepancy in the definition of the
embedded supersonic zone. A similar observation may be made for the work
of Shea in his investigation of the two-dimensional normal-shock wave
turbulent boundary layer interaction.


Figure 7. Experimentally Measured Flow Field Structure in the Plane
of Symmetry

Figure 8. Computed Mach Number Contour in the Plane of Symmetry


In Figure 9, the velocity distribution parallel to the wind tunnel side
is given. A reverse flow is observed beneath the lambda shock wave
system. The separated flow region begins about three boundary-layer
thicknesses upstream of the normal shock and terminates at five
boundary-layer thicknesses downstream. The length of the separated domain
is similar to the measurement of Seddon and the numerical simulation by
Shea.

Figure 9. Computed Velocity Field in the Interaction Region

The entire flow field structure is presented in Figure 10 in terms of
density contours at various streamwise locations. The shear layer over
the corner region, the strong inviscid-viscous domain, and the subsequent
readjustment of the flow field are easily detectable. A clear indication
of substantial growth of the shear layer over the wind tunnel wall is
also obvious.

CONCLUSIONS 

A three-dimensional, time-dependent Navier-Stokes code using MacCormack's
explicit scheme has been vectorized for the CRAY-1 computer and achieved
a speed of 128 times that of the original scalar code processed by a
CYBER 74 computer. The vectorized code outperforms the scalar code on the
CRAY-1 computer by a factor of 8.13.

The numerical simulation for a turbulent, transonic, normal shock-wave
boundary-layer interaction in a wind tunnel has been successfully
performed using a total of 139,400 grid points. The numerical result
indicates sufficient resolution for engineering purposes. An additional
increase in speed by up to an order of magnitude through algorithm
improvement also seems attainable.

Figure 10. Perspective View of Density Contours